Search Results
Results: 1 through 50 of 256 (0.236 secs)
You searched for: subject:"webcrawl"
[unknown]cr.opensolaris.org
The code review section of opensolaris.org
Keywords: webcrawl
Downloads: 50
[unknown]blog.memolane.com
The memolane blog saved on the day they shut down.
Keywords: webcrawl
Downloads: 246
[texts]Major League Soccer Talk capture
This is a capture of the website Major League Soccer Talk from 9 April 2013, which includes the posts and their accompanying comments. This website ran from 2007 until 2013. From the closure announcement: "After launching in 2007, Major League Soccer Talk (aka MLS Talk) is coming to an end. In the next two weeks, we’ll be shutting down the website and consolidating all of the articles under the flagship website EPLTalk.com...
Keywords: webcrawl
Downloads: 16
[texts]American Soccer History Archives panic grab
Panic grab of the American Soccer History Archives, http://homepages.sover.net/~spectrum/index.html, on 10 April 2013 since it does not look as though the site has been updated in over a year. From the site's description: "The American Soccer History Archives are a comprehensive repository of information, statistics and essays relating to all aspects of the history of soccer in the United States from the 1860's to the present...
Keywords: webcrawl
Downloads: 9
[unknown]Warner Home Video - Space Jam
The official Warner Home Video site for the 1996 film Space Jam starring Michael Jordan. This website has entered the pantheon of internet legend for remaining completely unchanged in design and content for nearly 20 years.
Keywords: webcrawl
Downloads: 1,277
[collection]survey_com00000
Survey crawl of .com domains started January 2011.
Keywords: webcrawl
Downloads: 73,275,084
[collection]Focused Crawls - Internet Archive
Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Keywords: webcrawl
Downloads: 159,621,423
[collection]survey_net00000
Survey crawl of .net domains started December 2010.
Keywords: webcrawl
Downloads: 7,213,929
[collection]Alexa Crawls
Crawl data donated by Alexa Internet. This data is currently not publicly accessible. Decryption Keys are kept in an item. Alexa is the leading provider of free, global web metrics. Search Alexa to discover the most successful sites on the web by keyword, category, or country.
Keywords: webcrawl
Downloads: 1,256,706,986
[unknown]actionunleashed.ign.com
A backup of the actionunleashed subdomain of ign.com before they shut down.
Keywords: webcrawl; archiveteam
Downloads: 10
[unknown]mail.opensolaris.org
This is a backup of all the mailing list archives located on mail.opensolaris.org. It spans 242 mailing lists containing a total of 8,476 months of archives.
Keywords: webcrawl; archiveteam
Downloads: 82
[unknown]rbelmont.mameworld.info.warc
A crawl of http://rbelmont.mameworld.info/, taken 2013-05-22.
Keywords: webcrawl; archiveteam
Downloads: 16
[unknown]Hydriz's personal identifier
This identifier is owned by Hydriz. It allows Hydriz to temporarily store files that will eventually go into their respective identifiers on the Internet Archive, but not at the current point in time. Therefore, any file in this identifier can disappear at any time, but it will always hold files that the Internet Archive would like to have; they are just not sorted out at the moment.
Keywords: hydriz; webcrawl
Downloads: 772
[unknown]North American Old Catholic Church
This is a mirror of http://www.naoldcatholic.com/
Keywords: webcrawl; archiveteam
Downloads: 30
[unknown]Twit Cleaner
A backup of the Twit Cleaner service right after they announced they were shutting down due to the Twitter API changes.
Keywords: webcrawl; archiveteam
Downloads: 56
[texts]Tribes Forum Grab
Emergency grab of the Tribes and other game forums before the shutdown.
Keywords: webcrawl; archiveteam
Downloads: 11
[texts]telinco.co.uk user pages
A backup of http://telinco.co.uk before their closing on 2013-04-30.
Keywords: webcrawl; archiveteam
Downloads: 14
[unknown]A backup of arena.gamespy.com
This is a panic grab of arena.gamespy.com from early 2013.
Keywords: archiveteam; webcrawl
Downloads: 4
[texts]An assortment of warcs made in 2013
An assortment of warcs made in 2013. Part 1.
Keywords: archiveteam; webcrawl
Downloads: 9
[unknown]Nwnet.co.uk user pages
A backup of http://www.nwnet.co.uk/ on 2013-04-30.
Keywords: webcrawl; archiveteam
Downloads: 7
[unknown]Internet Census 2012
A grab of http://internetcensus2012.bitbucket.org/, the Internet Census 2012: port scanning /0 using insecure embedded devices.
Keywords: webcrawl; archiveteam
Downloads: 49
[unknown]IGN coverage of the Lord of the Rings Online game
A backup of http://lotrovault.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 19
[software]A backup of pix.toile-libre.org
This is a grab including database dumps of pix.toile-libre.org from early 2013. Toile Libre is an image hosting service based in France.
Keywords: archiveteam; webcrawl
Downloads: 16
[software]Release ISO images of Slackware 12.2
The release ISO images of Slackware 12.2 including sha256 sum files.
Keywords: webcrawl; archiveteam
Downloads: 33
[unknown]Asheron's Call Vault
A last-minute grab of http://acvault.ign.com/ before it shuts down.
Keywords: webcrawl; archiveteam
Downloads: 25
[software]Release ISO images of Slackware 13.0
The release ISO images of Slackware 13.0 including sha256 sum files.
Keywords: webcrawl; archiveteam
Downloads: 32
[unknown]Gamespy Arcade site
A backup of http://gamespyarcade.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 12
[software]Release ISO images of Gentoo 2012
The release ISO images of Gentoo 2012 including sha256 sum files.
Keywords: webcrawl; archiveteam
Downloads: 43
[unknown]IGN site for gamespy software
A backup of http://play.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 8
[unknown]A list of all blip.tv video urls 2013-10-09
All the video URLs collected from the sitemaps on 2013-10-09. bliptv_video_urls.txt.gz - Contains all the URLs to directly download a video file. These were extracted from the XML sitemaps blip.tv uses. 228,133 URLs total. bliptv_sitemaps_2013-10-09.tar - Contains all the gzip-compressed sitemaps that were scanned to make the video URL list. These files also contain metadata about the videos themselves.
Keywords: archiveteam; webcrawl
Downloads: 36
[unknown]IGN coverage of the City of Heroes game
A backup of http://cohvault.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 3
[unknown]IGN coverage of the Grand Theft Auto games
A backup of http://gta.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 8
[unknown]Gamasutra The Art & Business of Making Games
A last-minute grab of http://www.gamasutra.com/ before it shuts down.
Keywords: webcrawl; archiveteam
Downloads: 36
[unknown]tweetfilm.net
A full site grab of tweetfilm.net
Keywords: webcrawl; archiveteam
Downloads: 23
[unknown]IGN mac website for mac based games
A backup of http://sbvault.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 83
[unknown]IGN coverage of the Super Smash Bros game
A backup of http://supersmashbros.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 7
[unknown]IGN coverage of the ces2009
A backup of http://ces2009.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 3
[software]Electronic Arts software patches and addons (May 2012)
A backup of http://ftp.ea.com on 2012-05-06. 34 GB of patches and addons for Electronic Arts games.
Keywords: webcrawl; archiveteam
Downloads: 84
[unknown]User pages from the Monmouth ISP
This is the collection of all the user accounts hosted on the Monmouth Internet provider. Active since 1995.
Keywords: webcrawl; archiveteam
Downloads: 23
[unknown]TV Tattle
A backup of http://tvtattle.com since it is shutting down.
Keywords: webcrawl; archiveteam
Downloads: 29
[unknown]IGN best of site - Irish version
A backup of http://ie.bestof.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 9
[unknown]Panic grab of all AOL Music sites
AOL announced they are shutting down AOL Music and all associated sites. This is a panic grab.
Keywords: webcrawl; archiveteam
Downloads: 9
[software]Release ISO images of Slackware 3.2
The release ISO images of Slackware 3.2 including sha256 sum files.
Keywords: webcrawl; archiveteam
Downloads: 64
[software]Release ISO images of Gentoo 11.2
The release ISO images of Gentoo 11.2 including sha256 sum files.
Keywords: webcrawl; archiveteam
Downloads: 24
[unknown]IGN coverage of the Test Drive Unlimited game
A backup of http://tdu.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 45
[unknown]A backup of Game Developer Conference websites
This is a panic grab of the Game Developer Conference website and related properties from early 2013.
Keywords: archiveteam; webcrawl
Downloads: 9
[unknown]IGN coverage of the Titan Quest game
A backup of http://titanquestvault.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 21
[unknown]IGN subscribers download page
A backup of http://s.insiderdownloads.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 7
[unknown]IGN app hosting site
A backup of http://touch.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 6
[unknown]IGN site
A backup of http://sts.ign.com due to the pending site shutdown.
Keywords: webcrawl; archiveteam
Downloads: 14
Terms of Use (31 Dec 2014)